Text File | 1989-02-18 | 64KB | 1,431 lines
Linker Overview
- The primary purpose of a linker is to produce a file which can be
executed by an operating system.
- In DOS, executable files come in three flavors:
"EXE" files
"COM" files
"SYS" files
- The input to the linker comes from module(s) called translator
modules.
- Individual translator modules can be stored in "OBJ" files. "OBJ"
files are produced by language translators such as assemblers and
compilers.
- Libraries of translator modules can be stored in "LIB" files which
are produced by a librarian from either "OBJ" files or other "LIB"
files.
- A linker can process translator modules from either "OBJ" or "LIB"
files, but at least one module must have come from an "OBJ" file.
- All translator modules from "OBJ" files will be included in the
executable file, but only those modules from "LIB" files which resolve
externals are included.
Comparison of Executable Files
- "EXE" files have a header which contains relocation and other
information which permit "EXE" files to have multiple segments.
This header sets "EXE" files apart from the other executable file
types. Programs are written starting at address 0 (relative to
where the program is loaded in memory).
- "COM" files have no header. "COM" files are written starting at
address 100H because the "COM" file is loaded immediately after the
program segment prefix (PSP) which is 100H bytes long. At the start
of the program, the CS and DS segment registers point to the PSP.
"EXE" files which have an empty relocation table can be converted to
"COM" files by the "EXE2BIN" program.
- "SYS" files are more closely related to "COM" files than "EXE"
files. "SYS" files are loaded once at DOS initialization time.
Since there is no PSP, the programs are written starting at address
0. However, like "COM" files, "SYS" files must have an empty
relocation table.
Contents of Executable Files
    "EXE" files                                      "SYS" files
┌──────────────────┐                                ┌───────────┐
│   "EXE" header   │                                │load module├──┐
├──────────────────┤                                └───────────┘  │
│ relocation table │                                               │
├──────────────────┤                    ┌─────┐                    │
│   load module    ├────────────────────┤org 0├────────────────────┘
└──────────────────┘                    └─────┘
               "COM" files
              ┌───────────┐                 ┌────────┐
              │load module├─────────────────┤org 100H│
              └───────────┘                 └────────┘
- Each of the above is described in detail in the following slides.
"EXE" File Header
┌──────┬───────────────────────────────────────────────────────────────┐
│Offset│                          Description                          │
├──────┼───────────────────────────────────────────────────────────────┤
│00-01 │ The "EXE" file signature 4D5AH.                               │
├──────┼───────────────────────────────────────────────────────────────┤
│02-03 │ Length of the "EXE" file modulo 512.                          │
├──────┼───────────────────────────────────────────────────────────────┤
│04-05 │ Number of 512-byte pages. If the last page is not full it is  │
│      │ still included in the count.                                  │
├──────┼───────────────────────────────────────────────────────────────┤
│06-07 │ Number of relocation items.                                   │
├──────┼───────────────────────────────────────────────────────────────┤
│08-09 │ Num of 16-byte paras occupied by "EXE" header and relo table. │
├──────┼───────────────────────────────────────────────────────────────┤
│0A-0B │ Number of paragraphs required immediately after load module.  │
│      │ The linker computes the number of uninitialized bytes at the  │
│      │ end of the load module. Instead of writing these bytes to     │
│      │ the "EXE" file, the linker sets this value to provide space   │
│      │ for this data.                                                │
├──────┼───────────────────────────────────────────────────────────────┤
│0C-0D │ Max number of paras which may be required immediately after   │
│      │ the "EXE" file. This value comes from the CPARMAXALLOC switch.│
├──────┼───────────────────────────────────────────────────────────────┤
│0E-11 │ Segment/Offset displacement into the load module of initial   │
│      │ SS/SP. The displacement is converted to an actual address by  │
│      │ adding the base address of the load module. Since there may   │
│      │ be several stack segments in the translator modules, the      │
│      │ linker uses the highest address of the largest segment with   │
│      │ the stack attribute.                                          │
├──────┼───────────────────────────────────────────────────────────────┤
│12-13 │ Word checksum computed as minus the sum of all the words in   │
│      │ the file. Overflows are ignored. DOS will not validate the    │
│      │ checksum if this value is set to zero.                        │
├──────┼───────────────────────────────────────────────────────────────┤
│14-17 │ Offset/Segment displacement into the load module of the       │
│      │ initial IP/CS. The displacement is converted to an actual     │
│      │ address by adding the base address of the load module.        │
├──────┼───────────────────────────────────────────────────────────────┤
│18-19 │ Lgh of "EXE" file header. (Used to find start of relo table.) │
├──────┼───────────────────────────────────────────────────────────────┤
│1A-1B │ Overlay number. This is zero for the main program.            │
├──────┼───────────────────────────────────────────────────────────────┤
│1C-1D │ Always 0001H.                                                 │
└──────┴───────────────────────────────────────────────────────────────┘
Note: "EXE" file header routines are in "EXECFILE.C" (handout page 9)
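As a rough illustration of the header layout, the following Python
sketch unpacks the 1EH-byte fixed header with the standard "struct"
module. The field names and the load_module_size() helper are invented
for this example and are not taken from "EXECFILE.C". It assumes the
word at 0E-0F is SS and the word at 10-11 is SP, and that a value of 0
at 02-03 means the last page is full.

```python
import struct

# Offsets 00-1D: a 2-byte signature followed by fourteen little-endian
# words (30 bytes = 1EH in all).
EXE_HEADER = struct.Struct("<2s14H")
FIELDS = ("signature", "last_page_bytes", "pages", "relocation_items",
          "header_paras", "min_paras", "max_paras", "ss", "sp",
          "checksum", "ip", "cs", "relocation_offset", "overlay",
          "word_1c")

def parse_exe_header(data):
    """Return the fixed part of an "EXE" header as a dict."""
    header = dict(zip(FIELDS, EXE_HEADER.unpack_from(data)))
    if header["signature"] != b"MZ":      # the bytes 4DH, 5AH
        raise ValueError("not an EXE file")
    return header

def load_module_size(header):
    """Number of bytes in the file that belong to the load module."""
    size = header["pages"] * 512
    if header["last_page_bytes"]:         # last page only partly used
        size -= 512 - header["last_page_bytes"]
    return size - header["header_paras"] * 16
```

For a file of two pages with 50H bytes in the last page and a 2-paragraph
header, load_module_size() gives 592 - 32 = 560 bytes.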
Relocation Table
The number of relocation items is specified at offset 06-07 in the
"EXE" file header. The offset of the relocation table is located at
offset 18-19 in the "EXE" file header. Usually this value is 001EH,
but can be larger if the "EXE" file header has been extended.
The relocation table can be viewed as an array of Offset/Segment
displacements into the load module. For each address, the segment
portion of the base address of the load module is added to the word at
that displacement. The only time a relocation item is needed is when a
fixup involves a segment. As we will see later, this can only be
caused by base and pointer type fixups.
As an example of how a relocation item is generated, consider the
following:
foo db 'a'
bar dd foo
Note that the contents of "bar" is the address of "foo", but the actual
address of "foo" is not known until the program is loaded in memory.
All that is known at link time is how far (i.e., the displacement) into
the load module "foo" and "bar" are located. Only the displacement of
"foo" is stored at link time. But, the linker makes a relocation entry
showing that "bar"+2 must be relocated.
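What happens to such a relocation item at load time can be sketched in
Python as follows. The function name and the (offset, segment) tuple
representation are assumptions made for this illustration; the load
module is taken to be a plain byte string beginning at displacement 0.

```python
import struct

def apply_relocations(load_module, relocation_items, base_segment):
    """Add base_segment to every word named by a relocation item.

    relocation_items is a list of (offset, segment) displacements into
    the load module; base_segment is the paragraph where it was loaded.
    """
    image = bytearray(load_module)
    for offset, segment in relocation_items:
        where = segment * 16 + offset
        (word,) = struct.unpack_from("<H", image, where)
        struct.pack_into("<H", image, where, (word + base_segment) & 0xFFFF)
    return bytes(image)

# "foo" is at displacement 0 and "bar" at displacement 1, so the segment
# word of "bar" ("bar"+2) is at displacement 3.
module = b"a" + b"\x00\x00" + b"\x00\x00"
loaded = apply_relocations(module, [(3, 0)], 0x1234)
```

After loading at paragraph 1234H, the offset word of "bar" is still 0000H
and its segment word has become 1234H.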
Load Module
By far, the generation of the load module is the trickiest part of
producing an executable file. The contents of the load module are
generated from the contents of translator modules stored in "OBJ" and
"LIB" files. Much of the structure of translator modules comes from
the assembler directives. In turn, those directives are related to the
architecture of the 80x86 family of micros.
Here are some of the important directives:
SEGMENT/ENDS is used to give a logical name along with grouping and
combining information for the segments. The linker is responsible
for arranging and combining these logical segments into the
physical segments which comprise a load module.
EXTRN/PUBLIC is used to access data across translator modules.
A detailed description of the format of translator modules follows.
Terminology and Abbreviations
MAS - Memory Address Space: The memory capable of being addressed by
the hardware architecture.
T-MODULE - Translator Module: This is a unit of object code produced
by a language translator. They may be stored individually in "OBJ"
files or collections may be stored in "LIB" files.
FRAME: A contiguous 64K chunk of MAS.
FRAME NUMBER: Paragraph number where a FRAME begins.
CANONIC FRAME: For the 8086, each byte of memory is encompassed by up
to 4096 FRAMEs. The CANONIC FRAME is the FRAME with the highest FRAME
NUMBER which encompasses that byte, so the byte lies at offset 0 to 15
in its CANONIC FRAME.
LSEG - Logical Segment: Data and code between SEGMENT - ENDS
directives.
PSEG - Physical Segment: A collection of one or more LSEGs placed into
a load module.
Fixup Overview
Not all references to MAS can be resolved at translation time. This
can happen when one T-MODULE must access data located in another
T-MODULE. When this happens, the language translator places an entry
in the T-MODULE so that the linker can complete the address reference.
Such entries in a T-MODULE are called "Fixups".
In order for the linker to complete the address reference, it needs
five pieces of information:
The LOCATION in memory where the reference occurs. This is
the address which must be fixed up.
The type of LOCATION in memory where the reference occurs.
Whether the fixup is relative to the IP or not. This is
referred to as the fixup MODE.
The TARGET address which LOCATION is referencing.
The FRAME number in the segment register used to reference
the TARGET address.
LOCATION Types
There are five types of LOCATIONs. They are POINTER, BASE, OFFSET,
HIBYTE, and LOBYTE. The relative position and length of each LOCATION
type, measured from a LOCATION X, is given below:
┌──────────┬──────────┬──────────┬──────────┐
│   X+0    │   X+1    │   X+2    │   X+3    │
└──────────┴──────────┴──────────┴──────────┘
├──LOBYTE──┤
           ├──HIBYTE──┤
├───────OFFSET────────┤
                      ├────────BASE─────────┤
├──────────────────POINTER──────────────────┤
Fixup Modes
There are two fixup modes. Both modes stem from the architecture of
the 80x86 microprocessor family. The TARGET may be addressed directly
via the offset/segment mechanism. This fixup mode is called
"segment-relative".
The other way a TARGET may be addressed is relative to the IP. For
example, the TARGETs of the 80x86 conditional jump instructions (e.g.,
JE, JC, JA, etc.) are all relative to the IP. This fixup mode is called
"self-relative". Note that the self-relative mode is relative to the
value of the IP when the instruction executes. In all cases, the IP
points to the first byte of the instruction following the one being
executed.
TARGET
The TARGET is the location in MAS being referenced by LOCATION. There
are four basic ways of specifying the TARGET:
TARGET is specified relative to an LSEG.
TARGET is specified relative to a group.
TARGET is specified relative to an external symbol.
TARGET is specified relative to an absolute FRAME.
Each of these has a primary method, which specifies a displacement, and
a secondary method, which does not (because the displacement is 0).
Primary TARGET Methods
┌──────┬─────────────────────────┬─────────────────────────────────────┐
│Method│        Notation         │             Description             │
├──────┼─────────────────────────┼─────────────────────────────────────┤
│  T0  │SI(segment),displacement │The TARGET is at the specified       │
│      │                         │displacement in the LSEG segment.    │
├──────┼─────────────────────────┼─────────────────────────────────────┤
│  T1  │GI(group),displacement   │The TARGET is at the specified       │
│      │                         │displacement in the group.           │
├──────┼─────────────────────────┼─────────────────────────────────────┤
│  T2  │EI(external),displacement│The TARGET is at the specified       │
│      │                         │displacement past the external.      │
├──────┼─────────────────────────┼─────────────────────────────────────┤
│  T3  │FR(frame),displacement   │The TARGET is at the specified       │
│      │                         │displacement past FRAME NUMBER frame.│
└──────┴─────────────────────────┴─────────────────────────────────────┘
Example:
SI(foo),4 means the TARGET is 4 bytes into the LSEG foo. Several
T-MODULES could have an LSEG named foo which may be combined into a
PSEG foo. So, the final displacement in the PSEG foo may not be 4.
The linker must take this into consideration.
Secondary TARGET Methods
┌──────┬────────────┬─────────────────────────────────────────────┐
│Method│  Notation  │                 Description                 │
├──────┼────────────┼─────────────────────────────────────────────┤
│  T4  │SI(segment) │The TARGET is the base of the LSEG segment.  │
├──────┼────────────┼─────────────────────────────────────────────┤
│  T5  │GI(group)   │The TARGET is the base of the group.         │
├──────┼────────────┼─────────────────────────────────────────────┤
│  T6  │EI(external)│The TARGET is the specified external.        │
├──────┼────────────┼─────────────────────────────────────────────┤
│  T7  │FR(frame)   │The TARGET is the specified frame.           │
└──────┴────────────┴─────────────────────────────────────────────┘
FRAME
The FRAME portion of a fixup specifies the FRAME NUMBER that will be
used as the frame of reference for LOCATION's reference to TARGET.
Typically, this frame of reference is one of the segment registers.
The FRAME NUMBER in the segment register is specified via the assembler
"ASSUME" directive.
Even if the fixup is self-relative, the TARGET must still be in the
FRAME given by the FRAME NUMBER in the segment register. So, a FRAME
is required for both segment-relative fixups and self-relative fixups.
There are seven methods of specifying a FRAME:
A segment
A group
An external
An absolute FRAME NUMBER
LOCATION's FRAME
TARGET's FRAME
No FRAME specified
FRAME Methods
┌──────┬───────────────────────────────────────────────────────────────┐
│Method│                          Description                          │
├──────┼───────────────────────────────────────────────────────────────┤
│  F0  │The FRAME for the fixup is the CANONIC FRAME of the PSEG       │
│      │containing the LSEG. (Since the fixup is generated at          │
│      │translation time, an LSEG is specified.)                       │
│      │Notation: SI(segment)                                          │
├──────┼───────────────────────────────────────────────────────────────┤
│  F1  │The FRAME for the fixup is the CANONIC FRAME of the PSEG in    │
│      │the specified group which is located lowest in MAS.            │
│      │Notation: GI(group)                                            │
├──────┼───────────────────────────────────────────────────────────────┤
│  F2  │The FRAME for the fixup is specified by an external.           │
│      │Typically, the external is located in a T-MODULE different     │
│      │from the T-MODULE generating the fixup.                        │
│      │Notation: EI(external)                                         │
├──────┼───────────────────────────────────────────────────────────────┤
│  F3  │The absolute FRAME NUMBER is specified.                        │
│      │Notation: FR(FRAME)                                            │
├──────┼───────────────────────────────────────────────────────────────┤
│  F4  │The FRAME is the CANONIC FRAME of the PSEG containing LOCATION.│
│      │Notation: LOCATION                                             │
├──────┼───────────────────────────────────────────────────────────────┤
│  F5  │The FRAME is determined by the TARGET. Notation: TARGET        │
├──────┼───────────────────────────────────────────────────────────────┤
│  F6  │No frame of reference specified. Notation: NONE                │
└──────┴───────────────────────────────────────────────────────────────┘
FRAME Method F2 cases
When FRAME method F2 is specified (FRAME is specified by an external),
there are three cases depending on how the external is defined:
┌──────┬───────────────────────────────────────────────────────────────┐
│Method│                          Description                          │
├──────┼───────────────────────────────────────────────────────────────┤
│ F2a  │If the external is defined relative to an LSEG which is not in │
│      │a group, then the FRAME is the CANONIC FRAME of the PSEG       │
│      │containing the LSEG.                                           │
├──────┼───────────────────────────────────────────────────────────────┤
│ F2b  │If the external is defined absolutely and not in a group, then │
│      │the FRAME is the CANONIC FRAME of the external.                │
├──────┼───────────────────────────────────────────────────────────────┤
│ F2c  │If the external is associated with a group, the FRAME is the   │
│      │CANONIC FRAME of the PSEG in the group with the lowest MAS.    │
└──────┴───────────────────────────────────────────────────────────────┘
FRAME Method F5 cases
When FRAME method F5 is specified (FRAME is specified by the TARGET),
there are four cases depending on how the TARGET was specified:
┌──────┬───────────────────────────────────────────────────────────────┐
│Method│                          Description                          │
├──────┼───────────────────────────────────────────────────────────────┤
│ F5a  │If the TARGET method is T0 or T4, then FRAME is the CANONIC    │
│      │FRAME of PSEG containing TARGET.                               │
├──────┼───────────────────────────────────────────────────────────────┤
│ F5b  │If the TARGET method is T1 or T5, then FRAME is the CANONIC    │
│      │FRAME of PSEG in the same group as TARGET and with the lowest  │
│      │MAS.                                                           │
├──────┼───────────────────────────────────────────────────────────────┤
│ F5c  │If the TARGET method is T2 or T6, then the FRAME is determined │
│      │by the rules given in FRAME method F2.                         │
├──────┼───────────────────────────────────────────────────────────────┤
│ F5d  │If the TARGET method is T3 or T7, then the FRAME is the FRAME  │
│      │NUMBER specified by the TARGET.                                │
└──────┴───────────────────────────────────────────────────────────────┘
Performing a Fixup
Regardless of the fixup mode (segment-relative or self-relative), the
first step in performing a fixup is to ensure that the TARGET is
addressable given the FRAME of reference. That is, the TARGET must lie
between FRAME and FRAME+65535 inclusive.
FRAME ≤ TARGET ≤ FRAME+65535
If this is not the case, a warning is given.
After verifying that TARGET can be addressed by FRAME, how a fixup is
performed depends on the fixup mode.
Self-Relative Fixups
Self-relative fixups are permitted for LOBYTE and OFFSET type LOCATIONs
only. If the LOCATION type is HIBYTE, BASE, or POINTER, no fixup is
performed and an error message is given.
For self-relative fixups, the value of the PC (CS:IP) when the
instruction is executed must be determined. Then, the DISTANCE to the
TARGET from the PC is computed. The formulae for DISTANCE and PC are
given below:
PC = LOCATION + 1 (If LOCATION type is LOBYTE)
PC = LOCATION + 2 (If LOCATION type is OFFSET)
DISTANCE = TARGET - PC
For LOBYTE locations, an error is issued when DISTANCE falls outside of
-128 ≤ DISTANCE ≤ 127.
The fixup is performed by adding DISTANCE to LOCATION. For LOBYTE
LOCATIONs, DISTANCE is added modulo 256. For OFFSET LOCATIONs,
DISTANCE is added modulo 65536.
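The self-relative rules above can be sketched in Python as shown below.
The function name and the use of a bytearray indexed directly by address
are assumptions of this sketch, not part of any particular linker.

```python
def self_relative_fixup(image, location, target, loc_type):
    """Apply a self-relative fixup in place and return DISTANCE.

    image is a bytearray indexed by address; location and target are
    addresses within it; loc_type is "LOBYTE" or "OFFSET".
    """
    sizes = {"LOBYTE": 1, "OFFSET": 2}
    if loc_type not in sizes:
        raise ValueError("self-relative fixup not permitted for " + loc_type)
    size = sizes[loc_type]
    pc = location + size                  # IP after the instruction
    distance = target - pc
    if loc_type == "LOBYTE" and not -128 <= distance <= 127:
        raise ValueError("TARGET out of range for a short jump")
    old = int.from_bytes(image[location:location + size], "little")
    new = (old + distance) % (256 ** size)   # modulo 256 or 65536
    image[location:location + size] = new.to_bytes(size, "little")
    return distance

# A short jump at address 2 whose displacement byte is at address 3,
# jumping back to address 0: DISTANCE = 0 - 4 = -4, stored as FCH.
code = bytearray(b"\x90\x90\xEB\x00")
self_relative_fixup(code, 3, 0, "LOBYTE")
```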
Segment-Relative Fixups
For segment-relative fixups, DISTANCE = TARGET - FRAME. If DISTANCE
falls outside of 0 ≤ DISTANCE ≤ 65535, a warning is issued. The
following table gives how to perform the fixup depending on LOCATION
type:
┌────────┬─────────────────────────────────────────────────────────────┐
│LOCATION│                           Action                            │
│  type  │                                                             │
├────────┼─────────────────────────────────────────────────────────────┤
│LOBYTE  │DISTANCE is added (modulo 256) to low order byte at LOCATION.│
├────────┼─────────────────────────────────────────────────────────────┤
│HIBYTE  │DISTANCE is added (modulo 256) to high order byte at LOCATION│
├────────┼─────────────────────────────────────────────────────────────┤
│OFFSET  │DISTANCE is added (modulo 65536) to the word at LOCATION.    │
├────────┼─────────────────────────────────────────────────────────────┤
│BASE    │FRAME is added (modulo 65536) to the word at LOCATION.       │
│        │A relocation item for the word at LOCATION is created.       │
├────────┼─────────────────────────────────────────────────────────────┤
│POINTER │DISTANCE is added (modulo 65536) to the low order word at    │
│        │LOCATION. FRAME is added (modulo 65536) to the high order    │
│        │word of the DWORD at LOCATION. A relocation item for the     │
│        │high order word of the DWORD at LOCATION is created.         │
└────────┴─────────────────────────────────────────────────────────────┘
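A rough Python sketch of the segment-relative case follows. It assumes
FRAME is held as a FRAME NUMBER (paragraph), so DISTANCE is computed as
TARGET - FRAME*16, and it follows the earlier statement that a
relocation item is needed only where a segment value is stored (BASE,
and the high order word of a POINTER). HIBYTE is left out, and an error
is raised where a real linker would only warn. All names are invented
for the example.

```python
def segment_relative_fixup(image, location, target, frame, loc_type):
    """Apply a fixup in place; return the addresses needing relocation."""
    distance = target - frame * 16
    if not 0 <= distance <= 0xFFFF:
        raise ValueError("TARGET is not addressable from FRAME")

    def add(at, size, value):
        # Add value to the little-endian field at 'at', modulo its size.
        old = int.from_bytes(image[at:at + size], "little")
        new = (old + value) % (256 ** size)
        image[at:at + size] = new.to_bytes(size, "little")

    relocations = []
    if loc_type == "LOBYTE":
        add(location, 1, distance)
    elif loc_type == "OFFSET":
        add(location, 2, distance)
    elif loc_type == "BASE":
        add(location, 2, frame)
        relocations.append(location)
    elif loc_type == "POINTER":
        add(location, 2, distance)
        add(location + 2, 2, frame)
        relocations.append(location + 2)
    else:
        raise ValueError("unsupported LOCATION type " + loc_type)
    return relocations

# A DWORD at address 0 pointing at address 25H, relative to FRAME 2.
image = bytearray(8)
items = segment_relative_fixup(image, 0, 0x25, 2, "POINTER")
```

Here the low order word becomes 0005H, the high order word becomes 0002H,
and one relocation item (for address 2) is returned.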
T-MODULE Record Format Basics
All T-MODULE records have the following basic format:
                     ┌──────┬───────┬────────────────────┬─────┐
                     │Record│ Record│Information Specific│Check│
Field Names─────────>│ Type │ Length│   to Record Type   │ Sum │
                     ├──────┼───────┼────────────────────┼─────┤
Field Lengths───────>│  1   │   2   │ Record Length - 1  │  1  │
     (bytes)         └──────┴───────┴────────────────────┴─────┘
Record Type --
This one byte field identifies the type of T-MODULE record.
Record Length --
This word contains the number of bytes in all following fields
(including the checksum).
Information Specific to Record Type --
This field contains the data for the specified Record Type.
Check Sum --
This byte contains the negative of the sum of all the preceding bytes
in the record. Therefore, the sum of all the bytes in the record
will be 0.
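The basic record layout and the checksum rule can be sketched in Python
like this. The function names are made up for the illustration, and the
Record Length word is assumed to be stored low byte first.

```python
import struct

def make_record(rec_type, body):
    """Build a record: type, length, body, then the check sum byte."""
    head = struct.pack("<BH", rec_type, len(body) + 1)
    check_sum = -sum(head + body) & 0xFF
    return head + body + bytes([check_sum])

def read_records(data):
    """Yield (record type, body) for each record in a T-MODULE."""
    pos = 0
    while pos + 3 <= len(data):
        rec_type, length = struct.unpack_from("<BH", data, pos)
        record = data[pos:pos + 3 + length]
        if len(record) != 3 + length:
            raise ValueError("truncated record")
        if sum(record) & 0xFF:            # all bytes must sum to 0
            raise ValueError("bad check sum")
        yield rec_type, record[3:-1]      # body without the check sum
        pos += 3 + length

record = make_record(0x80, b"\x05HELLO")
```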
T-MODULE Record Format -- Bit Fields
Bit fields are denoted as follows:
                       ┌─────┬─────┬─────┐
                       │ Bit │ Bit │ Bit │
Field Name────────────>│Field│Field│Field│
                       │  1  │  2  │  n  │
                       ├─────┼─────┼─────┤
Field Length──────────>│  4  │  1  │  3  │
    (bits)             ├─────┴─────┴─────┤
                       │      byte       │
                       └─────────────────┘
T-MODULE Record Format -- INDEX Fields
There are three special kinds of fields in a T-MODULE record: INDEX,
NAME, and VALUE fields.
"INDEX" fields: ┌─────┐
                │INDEX│
                ├─────┤
                │ 1-2 │
                └─────┘
An INDEX field is one or two bytes long. If the high order bit of the
first byte of the INDEX is 0, then the INDEX is one byte long and the
value is the remaining 7 bits (0 - 127). Otherwise, the INDEX is two
bytes long and the value of the INDEX is the low order 7 bits of the
first byte * 256 plus the second byte.
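A minimal Python sketch of the INDEX rule (the function name and the
"return the value and the next position" convention are choices made
for this example):

```python
def read_index(data, pos):
    """Decode an INDEX at data[pos]; return (value, next position)."""
    first = data[pos]
    if first & 0x80 == 0:                 # high bit clear: one byte
        return first, pos + 1
    return (first & 0x7F) * 256 + data[pos + 1], pos + 2
```

For example, the single byte 05H decodes to 5, while the two bytes
81H 02H decode to 1*256 + 2 = 258.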
T-MODULE Record Format -- NAME Fields
The format of a "NAME" field is:
┌──────┬───────────┐
│ NAME │   NAME    │
│Length│           │
├──────┼───────────┤
│  1   │NAME Length│
└──────┴───────────┘
Note: NAMEs of 0 bytes are permitted. The NAME is not NULL
terminated.
T-MODULE Record Format -- VALUE Fields
The format of "VALUE" fields is:
┌────┬──────┐
│Code│Number│
├────┼──────┤
│ 1  │ 0-4  │
└────┴──────┘
When 0 ≤ Code ≤ 128, the Number field is omitted and the VALUE is Code.
When Code = 129, the Number field is 2 bytes long, and the VALUE is
Number.
When Code = 132, the Number field is 3 bytes long, and the VALUE is
Number.
When Code = 136, the Number field is 4 bytes long, and the VALUE is
Number.
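The VALUE rules translate into Python roughly as follows. The sketch
assumes the Number field is stored low byte first, like other 80x86
multi-byte quantities, and treats any other Code as an error; the
function name is invented.

```python
def read_value(data, pos):
    """Decode a VALUE at data[pos]; return (value, next position)."""
    code = data[pos]
    if code <= 128:                       # Number field omitted
        return code, pos + 1
    lengths = {129: 2, 132: 3, 136: 4}    # Code -> Number length
    if code not in lengths:
        raise ValueError("unknown VALUE code %d" % code)
    size = lengths[code]
    number = data[pos + 1:pos + 1 + size]
    return int.from_bytes(number, "little"), pos + 1 + size
```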
T-MODULE Record Format -- T-MODULE Header
┌───┬──────┬─────────────┬─────┐
│   │Record│T-MODULE NAME│Check│
│80H│Length│             │ Sum │
├───┼──────┼─────────────┼─────┤
│ 1 │  2   │    NAME     │  1  │
└───┴──────┴─────────────┴─────┘
This record type must be the first record in the T-MODULE, and it names
the T-MODULE. Frequently, the T-MODULE NAME is the name of the source
file to the language translator.
T-MODULE Record Format -- List of NAMEs (LNAMEs)
┌───┬──────┬────────────┬─────┐
│   │Record│            │Check│
│96H│Length│Logical NAME│ Sum │
├───┼──────┼────────────┼─────┤
│ 1 │  2   │    NAME    │  1  │
└───┴──────┼────────────┼─────┘
           └──repeated──┘
Each Logical NAME is entered into a "List of NAMES" (LNAME) in the
order they are encountered in type 96H records. The list index starts
at 1 (0 means not specified). There may be more than one type 96H
record in a T-MODULE. When this occurs, append the Logical NAMEs to
the list. The Logical NAME field is repeated. The number of
repetitions is determined by the Record Length.
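Building the LNAME list might look like this in Python. The sketch
takes the body of a type 96H record (that is, without the type, length
and check sum bytes), assumes the names are plain ASCII, and keeps a
placeholder at position 0 so that list positions match the 1-based
INDEX values. The function name is invented.

```python
def parse_lnames(body, lnames=None):
    """Append the NAMEs in one 96H record body to the LNAME list."""
    if lnames is None:
        lnames = [None]                   # index 0 means "not specified"
    pos = 0
    while pos < len(body):                # repeat until the body is used up
        length = body[pos]
        name = body[pos + 1:pos + 1 + length]
        lnames.append(name.decode("ascii"))
        pos += 1 + length
    return lnames

names = parse_lnames(b"\x00\x04CODE\x04DATA")
names = parse_lnames(b"\x05STACK", names)   # a second 96H record
```

Note that the first name in this example is a NAME of 0 bytes.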
T-MODULE Record Format -- LSEG Definition (SEGDEF)
┌───┬──────┬─────────┬───────┬───────┬─────┬───────┬─────┐
│   │Record│ Segment │Segment│Segment│Class│Overlay│Check│
│98H│Length│Attribute│Length │ INDEX │INDEX│ INDEX │ Sum │
├───┼──────┼─────────┼───────┼───────┼─────┼───────┼─────┤
│ 1 │  2   │   1-4   │   2   │ INDEX │INDEX│ INDEX │  1  │
└───┴──────┴─────────┴───────┴───────┴─────┴───────┴─────┘
Segment Attribute: ┌────┬────────────┬──────┐  ACBP: ┌─┬─┬─┬─┐
                   │ACBP│FRAME NUMBER│Offset│        │A│C│B│P│
                   ├────┼────────────┼──────┤        ├─┼─┼─┼─┤
                   │ 1  │     2      │  1   │        │3│3│1│1│
                   └────┼────────────┴──────┤        ├─┴─┴─┴─┤
                        └────conditional────┘        │ Byte  │
                                                     └───────┘
Note: ACBP.P must be 0.
There is one SEGDEF record for each LSEG in a T-MODULE. Like the LNAME
list, this forms a list of LSEGs (indexed from 1).
T-MODULE Record Format -- SEGDEF.ACBP.A
The following table gives the meaning of the A field of the ACBP field:
┌─┬────────────────────────────────────────────────────────────────────┐
│A│                            Description                             │
├─┼────────────────────────────────────────────────────────────────────┤
│0│This is an absolute segment. The FRAME NUMBER and Offset fields of  │
│ │the Segment Attribute field will be present.                        │
├─┼────────────────────────────────────────────────────────────────────┤
│1│This is a relocatable, byte-aligned LSEG.                           │
├─┼────────────────────────────────────────────────────────────────────┤
│2│This is a relocatable, word-aligned LSEG.                           │
├─┼────────────────────────────────────────────────────────────────────┤
│3│This is a relocatable, paragraph-aligned LSEG.                      │
├─┼────────────────────────────────────────────────────────────────────┤
│4│This is a relocatable, page-aligned LSEG.                           │
├─┼────────────────────────────────────────────────────────────────────┤
│5│This is a relocatable, DWORD-aligned LSEG.                          │
└─┴────────────────────────────────────────────────────────────────────┘
T-MODULE Record Format -- SEGDEF.ACBP.C
The following table gives the meaning of the C field of the ACBP field:
┌─┬────────────────────────────────────────────────────────────────────┐
│C│                            Description                             │
├─┼────────────────────────────────────────────────────────────────────┤
│0│The LSEG is private and may not be combined.                        │
├─┼────────────────────────────────────────────────────────────────────┤
│1│Undefined.                                                          │
├─┼────────────────────────────────────────────────────────────────────┤
│2│The LSEG is public and may be combined with other LSEGs of the same │
│ │name.                                                               │
├─┼────────────────────────────────────────────────────────────────────┤
│3│Undefined.                                                          │
├─┼────────────────────────────────────────────────────────────────────┤
│4│The LSEG is public and may be combined with other LSEGs of the same │
│ │name.                                                               │
├─┼────────────────────────────────────────────────────────────────────┤
│5│The LSEG is a stack segment and may be combined with other LSEGs of │
│ │the same name.                                                      │
├─┼────────────────────────────────────────────────────────────────────┤
│6│The LSEG is a common segment and must be combined with other LSEGs  │
│ │of the same name.                                                   │
├─┼────────────────────────────────────────────────────────────────────┤
│7│The LSEG is public and may be combined with other LSEGs of the same │
│ │name.                                                               │
└─┴────────────────────────────────────────────────────────────────────┘
T-MODULE Record Format -- Notes on Combining
LSEGs which can be combined and are not common are combined as follows:
┌─────────────────────────────────┐
│LSEG data for first T-MODULE │
├─────────────────────────────────┤
│Alignment Gap for second T-MODULE│
├─────────────────────────────────┤
│LSEG data for second T-MODULE │
├─────────────────────────────────┤
│ /// │
├─────────────────────────────────┤
│Alignment Gap for last T-MODULE │
├─────────────────────────────────┤
│LSEG data for last T-MODULE │
└─────────────────────────────────┘
The resultant PSEG is the combined length of the above.
For common LSEGs, the length of the PSEG is the length of the largest
LSEG. Therefore, the length of the PSEG cannot be determined until all
the T-MODULEs are processed. For this reason, data cannot be loaded
into common segments until fixup time.
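The combining rules can be sketched in Python as below. The sketch
assumes each LSEG is described by its length and its alignment in bytes
(1 for byte, 2 for word, 16 for paragraph, and so on), and that the
PSEG itself starts on a suitably aligned boundary. Names are invented
for the illustration.

```python
def combine_lsegs(lsegs):
    """Lay out combinable (non-common) LSEGs one after another.

    lsegs is a list of (length, alignment) pairs in T-MODULE order.
    Returns (start offset of each LSEG in the PSEG, PSEG length).
    """
    offsets = []
    end = 0
    for length, alignment in lsegs:
        start = -(-end // alignment) * alignment   # skip the alignment gap
        offsets.append(start)
        end = start + length
    return offsets, end

def combine_common(lengths):
    """Common LSEGs overlay each other; the PSEG is the largest one."""
    return max(lengths)

offsets, pseg_length = combine_lsegs([(5, 1), (3, 2), (10, 16)])
```

With these three LSEGs the start offsets are 0, 6 and 16 and the PSEG is
26 bytes long. A TARGET given as displacement 4 in the third LSEG
therefore ends up at displacement 16 + 4 = 20 in the PSEG.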
T-MODULE Record Format -- LSEG Definition (SEGDEF)
┌───┬──────┬─────────┬───────┬───────┬─────┬───────┬─────┐
│   │Record│ Segment │Segment│Segment│Class│Overlay│Check│
│98H│Length│Attribute│Length │ INDEX │INDEX│ INDEX │ Sum │
├───┼──────┼─────────┼───────┼───────┼─────┼───────┼─────┤
│ 1 │  2   │   1-4   │   2   │ INDEX │INDEX│ INDEX │  1  │
└───┴──────┴─────────┴───────┴───────┴─────┴───────┴─────┘
Segment Length --
If ACBP.B is 0, then this is the length of the LSEG. If ACBP.B is 1,
then the Segment Length must be 0 and the length of the LSEG is
65536 bytes.
Segment INDEX --
This is an INDEX into the LNAME list. LNAME[Segment INDEX] is the
name of the LSEG.
Class INDEX --
This is an INDEX into the LNAME list. LNAME[Class INDEX] is the name
of the class of the LSEG.
Overlay INDEX --
This is an INDEX into the LNAME list. LNAME[Overlay INDEX] is the
name of the overlay of the LSEG.
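Putting the SEGDEF fields together, a Python sketch of decoding one
record body (without the type, length and check sum bytes) might look
like this. It assumes the A, C, B and P bits are packed from the high
order end of the ACBP byte in the order drawn, and that two-byte fields
are stored low byte first. The function and key names are invented.

```python
import struct

def read_index(data, pos):
    """Decode a 1 or 2 byte INDEX; return (value, next position)."""
    first = data[pos]
    if first & 0x80 == 0:
        return first, pos + 1
    return (first & 0x7F) * 256 + data[pos + 1], pos + 2

def parse_segdef(body, lnames):
    """Decode a SEGDEF body into a dict describing the LSEG."""
    acbp = body[0]
    align, combine, big = acbp >> 5, (acbp >> 2) & 7, (acbp >> 1) & 1
    lseg = {"align": align, "combine": combine}
    pos = 1
    if align == 0:                        # absolute: FRAME NUMBER, Offset
        lseg["frame"], lseg["offset"] = struct.unpack_from("<HB", body, pos)
        pos += 3
    (length,) = struct.unpack_from("<H", body, pos)
    pos += 2
    lseg["length"] = 65536 if big else length
    for key in ("name", "class", "overlay"):
        index, pos = read_index(body, pos)
        lseg[key] = lnames[index]
    return lseg

lnames = [None, "", "_TEXT", "CODE"]
lseg = parse_segdef(b"\x28\x10\x00\x02\x03\x01", lnames)
```

The ACBP byte 28H in this example decodes to A=1 (byte-aligned) and
C=2 (public), with B and P both 0.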
T-MODULE Record Format -- Group Definition (GRPDEF)
┌───┬──────┬───────────┬───┬─────┬─────┐
│   │Record│Group NAME │   │LSEG │Check│
│9AH│Length│   INDEX   │FFH│INDEX│ Sum │
├───┼──────┼───────────┼───┼─────┼─────┤
│ 1 │  2   │   INDEX   │ 1 │INDEX│  1  │
└───┴──────┴───────────┼───┴─────┼─────┘
                       └─repeated┘
Like the LNAME list and SEGDEF list, the GRPDEFs form a list (indexed
relative to 1) of groups. There is one GRPDEF record for each group in
the T-MODULE.
Group NAME INDEX --
LNAME[Group NAME INDEX] is the name of the group.
LSEG INDEX --
This field is repeated once for each LSEG in the group. The LSEG
INDEX is the INDEX into the LSEG list which corresponds to the LSEG
in the group.
T-MODULE Record Format -- Public Definition (PUBDEF)
┌───┬──────┬─────┬───────┬──────┬──────┬──────┬─────┬─────┐
│   │Record│Group│Segment│FRAME │Public│Public│Type │Check│
│90H│Length│INDEX│ INDEX │NUMBER│ NAME │Offset│INDEX│ Sum │
├───┼──────┼─────┼───────┼──────┼──────┼──────┼─────┼─────┤
│ 1 │  2   │INDEX│ INDEX │ 0-2  │ NAME │  2   │INDEX│  1  │
└───┴──────┴─────┴───────┴──────┼──────┴──────┴─────┼─────┘
                                └─────repeated──────┘
Group INDEX --
If the public(s) are defined in an LSEG which is part of a group,
then this is the index into the group list.
Segment INDEX --
If the public(s) are defined in an LSEG, this is the index into the
LSEG list.
FRAME NUMBER --
This field is only present if the public(s) are absolute (indicated
by both Group INDEX and Segment INDEX being zero). When present,
this is the FRAME NUMBER used to reference the public(s).
T-MODULE Record Format -- Public Definition (PUBDEF)
┌───┬──────┬─────┬───────┬──────┬──────┬──────┬─────┬─────┐
│   │Record│Group│Segment│FRAME │Public│Public│Type │Check│
│90H│Length│INDEX│ INDEX │NUMBER│ NAME │Offset│INDEX│ Sum │
├───┼──────┼─────┼───────┼──────┼──────┼──────┼─────┼─────┤
│ 1 │  2   │INDEX│ INDEX │ 0-2  │ NAME │  2   │INDEX│  1  │
└───┴──────┴─────┴───────┴──────┼──────┴──────┴─────┼─────┘
                                └─────repeated──────┘
Public NAME --
This is the name of the public.
Public Offset --
This is the distance of the start of the public from the group, LSEG
or FRAME.
Type INDEX --
This is ignored.
T-MODULE Record Format -- External Definition (EXTDEF)
┌───┬──────┬────────┬─────┬─────┐
│   │Record│External│Type │Check│
│8CH│Length│  NAME  │INDEX│ Sum │
├───┼──────┼────────┼─────┼─────┤
│ 1 │  2   │  NAME  │INDEX│  1  │
└───┴──────┼────────┴─────┼─────┘
           └───repeated───┘
Like the LNAMEs, SEGDEFs, and GRPDEFs, the EXTDEFs form a list (indexed
relative to 1) of the external names used in this T-MODULE.
External NAME --
This is the name of the external public symbol.
Type INDEX --
This is ignored.
T-MODULE Record Format -- Logical Enumerated Data (LEDATA)
┌───┬──────┬───────┬──────┬────────┬─────┐
│   │Record│Segment│      │        │Check│
│A0H│Length│ INDEX │Offset│  Data  │ Sum │
├───┼──────┼───────┼──────┼────────┼─────┤
│ 1 │  2   │ INDEX │  2   │   1    │  1  │
└───┴──────┴───────┴──────┼────────┼─────┘
                          └repeated┘
Segment INDEX --
This data is to be loaded into the LSEG corresponding to SEGDEF list
entry SEGDEF[Segment INDEX].
Offset --
This data is to be loaded starting at this offset in the LSEG.
Data --
The byte(s) to be loaded. No more than 1024 can be loaded by an
LEDATA record.
T-MODULE Record Format -- Logical Iterated Data (LIDATA)
┌───┬──────┬───────┬──────┬──────────┬─────┐
│   │Record│Segment│      │ Iterated │Check│
│A2H│Length│ INDEX │Offset│Data Block│ Sum │
├───┼──────┼───────┼──────┼──────────┼─────┤
│ 1 │  2   │ INDEX │  2   │ variable │  1  │
└───┴──────┴───────┴──────┼──────────┼─────┘
                          └─repeated─┘
Segment INDEX --
This data is to be loaded into the LSEG corresponding to SEGDEF list
entry SEGDEF[Segment INDEX].
Offset --
This data is to be loaded starting at this offset in the LSEG.
Iterated Data Block --
This is (recursively) defined later, but the total size cannot exceed
1024 bytes.
T-MODULE Record Format -- Logical Iterated Data (LIDATA)
Iterated Data Block: ┌──────┬─────┬────────┐
                     │Repeat│Block│        │
                     │Count │Count│ Content│
                     ├──────┼─────┼────────┤
                     │  2   │  2  │variable│
                     └──────┴─────┴────────┘
Repeat Count --
This is the number of times the Content field is repeated.
Block Count --
If this is zero, then the Content field is a one-byte length followed
by that many data bytes. If this is non-zero, then the Content field
contains a string of Block Count Iterated Data Blocks.
Content --
This is either a counted string of bytes as described above or it is
a string of Iterated Data Blocks.
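The recursive nature of an Iterated Data Block is easiest to see in
code. The Python sketch below assumes the usual Intel/Microsoft object
module layout: Repeat Count is always the number of repetitions, and
when Block Count is zero the Content is a one-byte length followed by
that many data bytes. The function name is invented.

```python
import struct

def expand_block(data, pos=0):
    """Expand one Iterated Data Block; return (bytes, next position)."""
    repeat_count, block_count = struct.unpack_from("<HH", data, pos)
    pos += 4
    if block_count == 0:
        length = data[pos]                # counted string of data bytes
        content = data[pos + 1:pos + 1 + length]
        pos += 1 + length
    else:
        parts = []
        for _ in range(block_count):      # nested Iterated Data Blocks
            part, pos = expand_block(data, pos)
            parts.append(part)
        content = b"".join(parts)
    return content * repeat_count, pos

simple = struct.pack("<HH", 3, 0) + b"\x02AB"
nested = (struct.pack("<HH", 2, 2)
          + struct.pack("<HH", 1, 0) + b"\x01X"
          + struct.pack("<HH", 2, 0) + b"\x01Y")
```

The first block expands to "ABABAB"; the nested one expands to "XYY"
repeated twice.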
T-MODULE Record Format -- Fixup Record (FIXUPP)
┌───┬──────┬────────┬─────┐
│   │Record│ Thread │Check│
│9CH│Length│or Fixup│ Sum │
├───┼──────┼────────┼─────┤
│ 1 │  2   │variable│  1  │
└───┴──────┼────────┼─────┘
           └repeated┘
Thread or Fixup --
This field can be either a Thread (high order bit is 0) or Fixup
(high order bit is 1). A Thread is a default TARGET or FRAME method.
There are four TARGET threads and four FRAME threads. Thread fields
are used to store the default TARGET or FRAME method. A Fixup type
field specifies the five pieces of information (discussed earlier)
necessary to perform a fixup.
T-MODULE Record Format -- Thread
Thread: ┌──────┬───────┐    Thread Data: ┌─┬─┬─┬──────┬─────┐
        │Thread│Thread │                 │0│D│0│Method│Thred│
        │ Data │ INDEX │                 ├─┼─┼─┼──────┼─────┤
        ├──────┼───────┤                 │1│1│1│  3   │  2  │
        │  1   │0-INDEX│                 ├─┴─┴─┴──────┴─────┤
        └──────┴───────┘                 │       byte       │
                                         └──────────────────┘
D --
If D is zero then a TARGET thread is being specified, otherwise a
FRAME thread is being specified.
Method --
This is the TARGET or FRAME method. For TARGET threads, only the
four primary methods are specified. All seven FRAME methods can be
specified.
Thred --
This is the TARGET or FRAME thread number being specified.
Thread INDEX --
This is not present when F4, F5 or F6 is being specified. In all
other cases, this is either a Segment, Group or External index
depending on the Method.
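The bit layout of the Thread Data byte can be picked apart as sketched
below. The function name decode_thread_data and the dictionary keys are
hypothetical; the bit positions are read directly off the diagram above
(bit 7 is 0, bit 6 is D, bits 4-2 are Method, bits 1-0 are Thred).

```python
def decode_thread_data(b):
    """Decode the one-byte Thread Data field of a FIXUPP Thread."""
    if b & 0x80:
        raise ValueError("high order bit is 1: this is a Fixup, not a Thread")
    kind = "FRAME" if b & 0x40 else "TARGET"   # the D bit
    method = (b >> 2) & 0x07                   # 3-bit Method
    thread = b & 0x03                          # 2-bit thread number
    # Per the text, F4, F5 and F6 are not followed by a Thread INDEX.
    has_index = method < 4
    return {"kind": kind, "method": method, "thread": thread,
            "has_index": has_index}


print(decode_thread_data(0x54))  # FRAME thread 0, method F5
print(decode_thread_data(0x01))  # TARGET thread 1, method T0
```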
T-MODULE Record Format -- Fixup
┌─────┬───────┬───────┬───────┬──────┐
│LOCAT│ Fixup │ FRAME │TARGET │TARGET│
│     │Methods│ INDEX │INDEX  │Offset│
├─────┼───────┼───────┼───────┼──────┤
│  2  │   1   │0-INDEX│0-INDEX│ 0-2  │
└─────┴───────┴───────┴───────┴──────┘
LOCAT: ┌─┬────┬─┬────────┬─────────┐
       │ │    │ │LOCATION│LE/LIDATA│
       │1│Mode│0│  Type  │ Offset  │
       ├─┼────┼─┼────────┼─────────┤
       │1│ 1  │1│   3    │   10    │
       ├─┴────┴─┴────────┴─────────┤
       │low byte          high byte│    Note: Low and high bytes
       └───────────────────────────┘          are swapped.
Mode --
If Mode is 0 then this is a self-relative fixup, otherwise it is a
segment-relative fixup. Self-relative fixups on LIDATA are not
permitted.
LOCATION Type: ┌────────┬────────────────┐
               │LOCATION│Type Description│
               ├────────┼────────────────┤
               │   0    │     LOBYTE     │
               ├────────┼────────────────┤
               │   1    │     OFFSET     │
               ├────────┼────────────────┤
               │   2    │      BASE      │
               ├────────┼────────────────┤
               │   3    │    POINTER     │
               ├────────┼────────────────┤
               │   4    │     HIBYTE     │
               └────────┴────────────────┘
LE/LIDATA Offset --
This field is used to determine the LOCATION information for the
fixup. This offset is actually an offset into the last LEDATA or
LIDATA record. The LEDATA or LIDATA contains the base LSEG and
offset information. Note that for LIDATA records, each time the data
at LE/LIDATA Offset is repeated, the fixup must occur.
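Because the two LOCAT bytes are stored swapped, the byte that appears
first in the record carries the flag bits and the top two bits of the
offset. A minimal sketch of the decoding, assuming exactly that byte
order (the function and key names are hypothetical):

```python
LOCATION_TYPES = {0: "LOBYTE", 1: "OFFSET", 2: "BASE", 3: "POINTER", 4: "HIBYTE"}


def decode_locat(first, second):
    """Decode LOCAT from its two bytes, in the order they occur in the record."""
    if not first & 0x80:
        raise ValueError("high order bit is 0: this is a Thread, not a Fixup")
    word = (first << 8) | second          # undo the byte swap
    mode = (word >> 14) & 0x01            # 0 = self-relative, 1 = segment-relative
    location = (word >> 10) & 0x07        # 3-bit LOCATION Type
    data_offset = word & 0x3FF            # 10-bit LE/LIDATA Offset
    return {
        "segment_relative": bool(mode),
        "location": LOCATION_TYPES.get(location, location),
        "data_offset": data_offset,
    }


print(decode_locat(0xC4, 0x05))
print(decode_locat(0x8D, 0x23))
```

The 10-bit offset field is consistent with the 1024-byte limit on
LEDATA and LIDATA records.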
Fixup Methods: ┌─┬─────┬─┬─┬──────┐
               │F│FRAME│T│P│TARGET│    F --
               ├─┼─────┼─┼─┼──────┤        If F is 1 then FRAME is a thread,
               │1│  3  │1│1│  2   │        else FRAME is the FRAME method.
               ├─┴─────┴─┴─┴──────┤
               │       byte       │    T --
               └──────────────────┘        If T is 1 then TARGET is a thread,
                                           else TARGET is the TARGET method.
FRAME --
This is either the FRAME method (F=0) or a FRAME thread (F=1).
TARGET --
This is either the TARGET method (T=0) or a TARGET thread (T=1).
P --
If P=0 then the primary TARGET methods are used and the TARGET Offset
field will be present. If P=1 then the secondary TARGET methods are
used and the TARGET Offset field is omitted.
FRAME INDEX --
Depending on the FRAME method, this is either a Segment, Group, or
External INDEX. This will be present only when a FRAME thread is not
used (F=0).
TARGET INDEX --
Depending on the TARGET method, this is either a Segment, Group, or
External INDEX. This will be present only when a TARGET thread is
not used (T=0).
TARGET Offset --
The TARGET is TARGET Offset bytes from the Segment, Group, or
External given by TARGET INDEX.
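The Fixup Methods byte determines which of the optional fields follow
it. The sketch below decodes the byte and reports field presence. The
names are hypothetical, and one assumption goes beyond the field
descriptions above: FRAME methods F4, F5 and F6 are taken to carry no
FRAME INDEX, mirroring the rule stated for Thread INDEX.

```python
def decode_fixup_methods(b):
    """Decode the Fixup Methods byte and report which fields follow it."""
    f = bool(b & 0x80)          # FRAME given by a thread?
    frame = (b >> 4) & 0x07     # FRAME method, or FRAME thread number
    t = bool(b & 0x08)          # TARGET given by a thread?
    p = bool(b & 0x04)          # 0 = primary TARGET methods
    target = b & 0x03           # TARGET method, or TARGET thread number
    return {
        "frame_is_thread": f,
        "frame": frame,
        "target_is_thread": t,
        "target": target,
        "secondary": p,
        # Assumption: F4, F5 and F6 are not followed by an index.
        "has_frame_index": (not f) and frame < 4,
        "has_target_index": not t,
        "has_target_offset": not p,
    }


info = decode_fixup_methods(0x54)
print(info["frame"], info["has_frame_index"], info["has_target_offset"])
```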
T-MODULE Record Format -- T-MODULE End (MODEND)
┌───┬──────┬────┬────────┬─────┐
│   │Record│End │ Start  │Check│
│8AH│Length│Type│Address │ Sum │
├───┼──────┼────┼────────┼─────┤
│ 1 │  2   │ 1  │variable│  1  │
└───┴──────┴────┴────────┴─────┘
End Type:               Attribute:
┌─────────┬─┬─┐         ┌─────────┬──────────────────────────┐
│Attribute│0│1│         │Attribute│Description               │
├─────────┼─┼─┤         ├─────────┼──────────────────────────┤
│    2    │5│1│         │    0    │Non-main, no Start Address│
├─────────┴─┴─┤         ├─────────┼──────────────────────────┤
│    byte     │         │    1    │Non-main, Start Address   │
└─────────────┘         ├─────────┼──────────────────────────┤
                        │    2    │Main, no Start Address    │
                        ├─────────┼──────────────────────────┤
                        │    3    │Main, Start Address       │
                        └─────────┴──────────────────────────┘
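The Attribute occupies the top two bits of the End Type byte, so the
four cases in the table reduce to two independent flags. A small sketch
(the function name is hypothetical):

```python
def decode_end_type(b):
    """Decode the MODEND End Type byte into its Attribute and two flags."""
    attribute = (b >> 6) & 0x03
    return {
        "attribute": attribute,
        "main": bool(attribute & 0x02),               # Attribute 2 or 3
        "has_start_address": bool(attribute & 0x01),  # Attribute 1 or 3
    }


print(decode_end_type(0xC1))  # Attribute 3: main module with a Start Address
```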
T-MODULE Record Format -- T-MODULE End (MODEND)
Start Address: ┌───────┬───────┬───────┬──────┐
               │ Fixup │ FRAME │TARGET │TARGET│
               │Methods│ INDEX │INDEX  │Offset│
               ├───────┼───────┼───────┼──────┤
               │   1   │0-INDEX│0-INDEX│ 0-2  │
               └───────┴───────┴───────┴──────┘
The above fields work exactly the same as they do in a FIXUPP record.
The start address is computed from the FRAME and TARGET specified
above. The initial CS is the FRAME NUMBER of the FRAME, and the
initial IP is TARGET - FRAME.
T-MODULE Record Format -- Comment Record (COMENT)
┌───┬──────┬───────┬────────┬─────┐
│   │Record│Comment│        │Check│
│88H│Length│ Type  │Comment │ Sum │
├───┼──────┼───────┼────────┼─────┤
│ 1 │  2   │   2   │variable│  1  │
└───┴──────┴───────┴────────┴─────┘
Comment Type: ┌─────┬────┬─┬─────┐
              │Purge│List│0│Class│
              ├─────┼────┼─┼─────┤
              │  1  │ 1  │6│  8  │
              ├─────┴────┴─┴─────┤
              │       word       │
              └──────────────────┘
Purge --
If Purge=1, the COMENT record should not be deleted by utilities
which can delete comments.
List --
If List=1, the COMENT record should not be listed by utilities which
can list comments.
T-MODULE Record Format -- Comment Record (COMENT)
┌─────┬────────────────────────────────────────────────────────────────┐
│Class│                          Description                           │
├─────┼────────────────────────────────────────────────────────────────┤
│ 129 │Do not do a default library search.                             │
├─────┼────────────────────────────────────────────────────────────────┤
│ 157 │Memory model information is in the Comment field.               │
├─────┼────────────────────────────────────────────────────────────────┤
│ 158 │Use the "DOSSEG" ordering.                                      │
├─────┼────────────────────────────────────────────────────────────────┤
│ 159 │Library name is in the Comment field.                           │
├─────┼────────────────────────────────────────────────────────────────┤
│ 161 │Codeview (registered trademark of Microsoft) information is     │
│     │present.                                                        │
├─────┼────────────────────────────────────────────────────────────────┤
│ 162 │Pass 1 of linker can stop processing T-MODULE here.             │
└─────┴────────────────────────────────────────────────────────────────┘
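A sketch of how the Comment Type word might be decoded and matched
against the class table. It assumes the two bytes appear in the record
in the order drawn (the byte holding Purge and List first, the Class
byte second); the function name and the short class labels are
hypothetical.

```python
COMMENT_CLASSES = {
    129: "no default library search",
    157: "memory model information",
    158: "DOSSEG ordering",
    159: "library name",
    161: "Codeview information",
    162: "pass 1 may stop here",
}


def decode_comment_type(first, second):
    """Decode the two Comment Type bytes of a COMENT record."""
    return {
        "purge": bool(first & 0x80),   # 1 = do not delete
        "list": bool(first & 0x40),    # 1 = do not list
        "class": second,
        "meaning": COMMENT_CLASSES.get(second, "unknown"),
    }


print(decode_comment_type(0x80, 0x9F))  # class 159, library name
```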
T-MODULE Record Format -- Communal Definition (COMDEF)
┌───┬──────┬──────┬─────┬───────────┬──────────┬─────┐
│   │      │Symbol│Type │Far or Near│ Communal │Check│
│B0H│Length│ NAME │INDEX│ Communal  │   Size   │ Sum │
├───┼──────┼──────┼─────┼───────────┼──────────┼─────┤
│ 1 │  2   │ NAME │INDEX│     1     │ variable │  1  │
└───┴──────┴──────┴─────┴───────────┴──────────┴─────┘
Symbol NAME --
This is the name of the communal symbol. It is placed in the list of
external symbols just as if it were an EXTDEF record. If a PUBDEF
record with the same Symbol NAME is encountered, it overrides the
COMDEF.
Type INDEX --
This is ignored.
Far or Near Communal --
Far communals have a 61H coded here. Near communals have a 62H coded
here.
The ultimate size of the communal is the largest size specified for it
by any COMDEF record.
T-MODULE Record Format -- Communal Definition (COMDEF)
For near communals, Communal Size is:   ┌─────┐
                                        │SIZE │
                                        │VALUE│
                                        ├─────┤
                                        │VALUE│
                                        └─────┘
For far communals, Communal Size is:    ┌─────┬─────┐
                                        │COUNT│SIZE │
The size of the communal is:            │VALUE│VALUE│
                                        ├─────┼─────┤
    COUNT VALUE * SIZE VALUE            │VALUE│VALUE│
                                        └─────┴─────┘
Near communals go in DGROUP. Far communals go in HUGE_BSS and are
packed as compactly as possible into PSEGs of no more than 64K.
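The size rules for communals can be sketched as follows, with the VALUE
fields assumed to be already decoded into integers. The function names
and the (name, kind, values) tuple shape are hypothetical.

```python
FAR, NEAR = 0x61, 0x62


def communal_size(kind, values):
    """Size in bytes declared by one COMDEF entry."""
    if kind == NEAR:
        (size,) = values
        return size
    if kind == FAR:
        count, size = values
        return count * size
    raise ValueError("unknown communal kind: %02XH" % kind)


def merge_communals(definitions):
    """Keep the largest size declared for each communal name."""
    sizes = {}
    for name, kind, values in definitions:
        sizes[name] = max(sizes.get(name, 0), communal_size(kind, values))
    return sizes


print(merge_communals([
    ("buf", NEAR, (100,)),
    ("buf", NEAR, (256,)),
    ("tbl", FAR, (10, 4)),
]))
```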
T-MODULE Record Format -- Forward Reference Fixups (FORREF)
┌───┬──────┬───────┬────┬──────┬────────┬─────┐
│   │      │Segment│    │      │ Fixup  │Check│
│B2H│Length│ INDEX │Size│Offset│  Data  │ Sum │
├───┼──────┼───────┼────┼──────┼────────┼─────┤
│ 1 │  2   │ INDEX │ 1  │  2   │variable│  1  │
└───┴──────┴───────┴────┼──────┴────────┼─────┘
                        └───repeated────┘
Segment INDEX --
The Forward Reference Fixup is to be applied to the LSEG element
whose index into the SEGDEF list is Segment INDEX.
Size --
This specifies the size of the Fixup Data fields. The Fixup Data
fields are a byte when Size = 0, a word when Size = 1, or a DWORD
when Size = 2.
Offset --
This is the Offset into the LSEG specified by Segment Index where the
fixup is applied.
Fixup Data --
This value is added at the specified Offset.
Note: The FORREF record may occur before the LE/LIDATA records which
load data into the LSEG. Therefore, FORREFs must be applied at
fixup time.
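Applying one Offset / Fixup Data pair amounts to an in-place addition on
the loaded LSEG image. The sketch below assumes little-endian storage
and that the sum wraps around at the field width; neither detail is
stated above. The function name is hypothetical.

```python
SIZE_BYTES = {0: 1, 1: 2, 2: 4}   # Size field -> width of Fixup Data


def apply_forref(lseg, size_code, offset, value):
    """Add value to the byte, word or DWORD at lseg[offset], in place."""
    width = SIZE_BYTES[size_code]
    current = int.from_bytes(lseg[offset:offset + width], "little")
    # Assumption: overflow wraps within the field.
    total = (current + value) % (1 << (8 * width))
    lseg[offset:offset + width] = total.to_bytes(width, "little")


image = bytearray(8)
image[2:4] = (0x1000).to_bytes(2, "little")
apply_forref(image, 1, 2, 0x0234)      # word fixup at offset 2
print(image[2:4].hex())
```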
T-MODULE Record Format -- Local External Definition (MODEXT)
┌───┬──────┬────────┬─────┬─────┐
│   │Record│External│Type │Check│
│B4H│Length│  NAME  │INDEX│ Sum │
├───┼──────┼────────┼─────┼─────┤
│ 1 │  2   │  NAME  │INDEX│  1  │
└───┴──────┼────────┴─────┼─────┘
           └───repeated───┘
The fields of the MODEXT record function just like the EXTDEF record
except that the external is local to this T-MODULE only. The External
NAME is included in the list of externals.
T-MODULE Record Format -- Local Public Definition (MODPUB)
┌───┬──────┬─────┬───────┬──────┬──────┬──────┬─────┬─────┐
│   │Record│Group│Segment│FRAME │Public│Public│Type │Check│
│B6H│Length│INDEX│ INDEX │NUMBER│ NAME │Offset│INDEX│ Sum │
├───┼──────┼─────┼───────┼──────┼──────┼──────┼─────┼─────┤
│ 1 │  2   │INDEX│ INDEX │ 0-2  │ NAME │  2   │INDEX│  1  │
└───┴──────┴─────┴───────┴──────┼──────┴──────┴─────┼─────┘
                                └─────repeated──────┘
The fields of the MODPUB record function just like the PUBDEF record
except that the public symbol is local to this T-MODULE only.
T-MODULE Record Format -- Line Number (LINNUM)
┌───┬──────┬──────────┬─────┐
│   │Record│          │Check│
│94H│Length│   Data   │ Sum │
├───┼──────┼──────────┼─────┤
│ 1 │  2   │    1     │  1  │
└───┴──────┼──────────┼─────┘
           └─repeated─┘
The LINNUM record is ignored by the linker.
T-MODULE Record Format -- Type Definition (TYPDEF)
┌───┬──────┬──────────┬─────┐
│   │Record│          │Check│
│8EH│Length│   Data   │ Sum │
├───┼──────┼──────────┼─────┤
│ 1 │  2   │    1     │  1  │
└───┴──────┼──────────┼─────┘
           └─repeated─┘
The TYPDEF record is ignored by the linker.
T-MODULE Record Format -- Record Order
Object modules are parsed via recursive descent as defined below:
t_module::      THEADR seg_grp {component} modtail
seg_grp::       {LNAMES | SEGDEF | EXTDEF} {TYPDEF | EXTDEF | GRPDEF}
component::     data | debug_record
data::          content_def | thread_def | COMDEF | TYPDEF | PUBDEF |
                EXTDEF | FORREF | MODPUB | MODEXT
debug_record::  LINNUM
content_def::   data_record {FIXUPP}
thread_def::    FIXUPP (containing only thread fields)
data_record::   LIDATA | LEDATA
modtail::       MODEND
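Since every production in this grammar is a repetition or an
alternation over record types, the order of records in a T-MODULE can be
checked with a single regular expression over the record names. This is
only a sketch, not a recursive descent parser: the names ORDER and
order_ok are hypothetical, and the check cannot tell a thread-only
FIXUPP from one that belongs to the preceding data record.

```python
import re

ORDER = re.compile(
    r"THEADR"
    r"( (LNAMES|SEGDEF|EXTDEF))*"      # first half of seg_grp
    r"( (TYPDEF|EXTDEF|GRPDEF))*"      # second half of seg_grp
    r"( (LIDATA|LEDATA|FIXUPP|COMDEF|TYPDEF|PUBDEF|EXTDEF"
    r"|FORREF|MODPUB|MODEXT|LINNUM))*"  # components
    r" MODEND"
)


def order_ok(record_names):
    """True if the sequence of record names fits the grammar above."""
    return ORDER.fullmatch(" ".join(record_names)) is not None


print(order_ok(["THEADR", "LNAMES", "SEGDEF", "GRPDEF",
                "LEDATA", "FIXUPP", "MODEND"]))
print(order_ok(["THEADR", "LEDATA", "GRPDEF", "MODEND"]))
```

The second call fails because a GRPDEF may not follow a data record.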
Primary Internal Data Structure
┌────────────┐             ┌─────────┐              ┌──────────────────┐
│ Segment #1 ├────────────>│ LSEG #1 ├─────────────>│ LSEG #1 Contents │
└─────┬──────┘             └────┬────┘              └──────────────────┘
      │                         │
      │                         v
      │                    ┌─────────┐              ┌──────────────────┐
      │                    │ LSEG #2 ├─────────────>│ LSEG #2 Contents │
      │                    └────┬────┘              └──────────────────┘
      │                         v
      │                        ///
      v
┌────────────┐             ┌─────────┐              ┌──────────────────┐
│ Segment #2 ├────────────>│ LSEG #1 ├─────────────>│ LSEG #1 Contents │
└─────┬──────┘             └────┬────┘              └──────────────────┘
      │                         │
      │                         v
      │                    ┌─────────┐              ┌──────────────────┐
      v                    │ LSEG #2 ├─────────────>│ LSEG #2 Contents │
     ///                   └────┬────┘              └──────────────────┘
                                v
                               ///
Temp File
The linker employs a temp file to save information which can only be
processed after all the T-MODULEs have been processed. The information
which must be saved is:
    Fixups
    LE/LIDATA for common LSEGs
    FORREF records
The temp file is deleted when processing is complete.
Library File Format
library_file:: header_page {t_modules} trailer_page {directory_pages}
header_page ::  ┌───┬──────┬─────────┬─────────┬──────────┬─┐
                │   │Record│Directory│Directory│          │ │
                │F0H│Length│ Offset  │  Pages  │   Pad    │0│
                ├───┼──────┼─────────┼─────────┼──────────┼─┤
                │ 1 │  2   │    4    │    2    │    1     │1│
                └───┴──────┴─────────┴─────────┼──────────┼─┘
                                       (prime) └─repeated─┘
t_modules ::    The t_modules are as described above except a pad is
                added after the MODEND record to make the t_module
                occupy a whole number of pages. The page size is the
                header_page Record Length + 3.
trailer_page :: ┌───┬──────┬──────────┬─┐
                │   │Record│          │ │
                │F1H│Length│   Pad    │0│
                ├───┼──────┼──────────┼─┤
                │ 1 │  2   │    1     │1│
                └───┴──────┼──────────┼─┘
                           └─repeated─┘
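Reading the header_page is enough to recover the page size and to find
the directory. The sketch below assumes little-endian fields and that
pages are numbered from the start of the file, so that a Starting Page
multiplied by the page size gives a file offset. The function names are
hypothetical.

```python
import struct


def parse_library_header(buf):
    """Decode the fixed part of a library header_page."""
    rec_type, rec_len, dir_offset, dir_pages = struct.unpack_from("<BHIH", buf, 0)
    if rec_type != 0xF0:
        raise ValueError("not a library header page")
    return {
        "page_size": rec_len + 3,        # type byte + length word + body
        "directory_offset": dir_offset,
        "directory_pages": dir_pages,
    }


def module_offset(header, starting_page):
    """File offset of a t_module (assumes pages count from offset 0)."""
    return starting_page * header["page_size"]


raw = struct.pack("<BHIH", 0xF0, 13, 1024, 2) + bytes(7)   # one 16-byte page
header = parse_library_header(raw)
print(header, module_offset(header, 3))
```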
Library File Format -- Directory
directory_pages :: public_pointer_array {public_entry} pad
Notes: A directory page is always 512 bytes. A directory page can
contain up to 37 public entries.
public_pointer_array --
This is a 38 byte array which is used to point into the public_entry
field. To determine where public i is located in the directory page,
take the ith byte of the public_pointer_array (relative 0) and
multiply it by 2. The result is the byte offset within the directory
page at which the public_entry for the ith public begins. The 38th
entry points to the beginning of the free space in the directory page.
public_entry :: ┌───────┬────────┐
                │ Public│Starting│
                │ NAME  │  Page  │
                ├───────┼────────┤
                │ NAME  │   2    │
                └───────┴────────┘
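The pointer arithmetic for one directory page can be sketched as below.
It assumes a NAME is a length byte followed by that many ASCII
characters, and that Starting Page is stored low byte first. The
function name is hypothetical.

```python
def read_public_entry(page, i):
    """Return (name, starting_page) for slot i of a directory page, or None."""
    if len(page) != 512:
        raise ValueError("a directory page is always 512 bytes")
    if not 0 <= i < 37:
        raise ValueError("slot must be in 0..36")
    pointer = page[i]
    if pointer == 0:
        return None                     # empty slot
    start = pointer * 2                 # byte offset of the public_entry
    name_len = page[start]
    name = page[start + 1:start + 1 + name_len].decode("ascii")
    page_field = page[start + 1 + name_len:start + 3 + name_len]
    return name, int.from_bytes(page_field, "little")


demo = bytearray(512)
demo[4] = 19                            # slot 4 -> offset 38
demo[38:43] = b"\x04main"
demo[43:45] = (3).to_bytes(2, "little")
demo[37] = 23                           # free space begins at offset 46
print(read_public_entry(bytes(demo), 4))
```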
Library File Format -- Finding a Public
The library directory employs a two-tiered hashing scheme to store
public names in its directory. A detailed description of the algorithm
is given later, but for now the following general aspects of the
algorithm are useful. To start the search, you need to know which
directory page to start searching, and if you don't find it in that
page, which directory page to search next. Once in a directory page,
you have to know which entry to use to begin the search and which entry
to search next if it was not found.
We will call the four required values STARTING_PAGE, DELTA_PAGE,
STARTING_ENTRY, and DELTA_ENTRY. The detail on how to compute these
values is given later.
Start with directory page STARTING_PAGE. On that page, examine
public_entry STARTING_ENTRY. There are three cases. This could be the
public symbol you desire, in which case you are done. The
public_pointer_array byte for this entry could be zero, in which case
the symbol is not in the library. Or, the public symbol at
STARTING_ENTRY could be some other public symbol. In this case, add
DELTA_ENTRY (modulo 37) to STARTING_ENTRY and examine that public
entry. Since there are at most 37 entries in any directory page,
examine no more than 37 entries in any given page. If you have tried
all entries on a page, proceed to the next page by adding DELTA_PAGE
(modulo Directory_Pages) to STARTING_PAGE and continue the process.
When you move to a new page, continue processing the public entries
where you left off.
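The probing sequence just described can be sketched as a pair of nested
loops. For illustration the directory is modelled as a list of pages,
each a list of 37 slots holding either None (pointer byte of zero) or a
(name, starting_page) tuple; this model and the function name are
hypothetical, and the four hash values are taken as inputs.

```python
ENTRIES_PER_PAGE = 37


def find_public(pages, name, start_page, delta_page, start_entry, delta_entry):
    """Return the Starting Page recorded for name, or None if it is absent."""
    n_pages = len(pages)
    page = start_page % n_pages
    entry = start_entry % ENTRIES_PER_PAGE
    for _ in range(n_pages):
        for _ in range(ENTRIES_PER_PAGE):
            slot = pages[page][entry]
            if slot is None:
                return None             # empty slot: not in the library
            if slot[0] == name:
                return slot[1]
            entry = (entry + delta_entry) % ENTRIES_PER_PAGE
        # Page exhausted: move on, keeping the current entry position.
        page = (page + delta_page) % n_pages
    return None


full = [("x%d" % i, i) for i in range(ENTRIES_PER_PAGE)]
second = [None] * ENTRIES_PER_PAGE
second[5] = ("main", 9)
print(find_public([full, second], "main", 0, 1, 5, 3))
```

Because 37 is prime, any non-zero DELTA_ENTRY visits every slot of a
page exactly once before returning to the starting slot.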
To compute the STARTING_PAGE, DELTA_PAGE, STARTING_ENTRY, and
DELTA_ENTRY, view a NAME field as if it were an array of bytes
containing the public name:
            ┌──────┬────┬────┬────┬─//─┬──────┐
NAME───────>│Length│byte│byte│byte│    │ byte │
            ├──────┼────┼────┼────┼─//─┼──────┤
            │  1   │ 1  │ 1  │ 1  │    │  1   │
            ├──────┼────┼────┼────┼─//─┼──────┤
index──────>│  0   │ 1  │ 2  │ 3  │    │Length│
            └──────┴────┴────┴────┴─//─┴──────┘
Then, the following code defines the values:
    STARTING_PAGE, DELTA_PAGE, STARTING_ENTRY, DELTA_ENTRY := 0;
    for i := 0 .. Length-1
      STARTING_PAGE := STARTING_PAGE+(NAME[i] or 20H) xor (<<STARTING_PAGE);
      DELTA_PAGE    := DELTA_PAGE+(NAME[Length-i] or 20H) xor (<<DELTA_PAGE);
      STARTING_ENTRY:= STARTING_ENTRY+(NAME[Length-i] or 20H)
                       xor (>>STARTING_ENTRY);
      DELTA_ENTRY   := DELTA_ENTRY+(NAME[i] or 20H) xor (>>DELTA_ENTRY);
    end for;
    if DELTA_ENTRY = 0 then DELTA_ENTRY := 1;
    if DELTA_PAGE = 0 then DELTA_PAGE := 1;
Note: << is circular shift left twice and
      >> is circular shift right twice.